Scaling and Universality in Continuous Length Combinatorial Optimization 
We consider combinatorial optimization problems defined over random
ensembles, and study how solution cost increases when the optimal
solution undergoes a small perturbation $\delta$. For the minimum
spanning tree, the increase in cost scales as $\delta^2$. For the
mean-field and Euclidean minimum matching and traveling salesman
problems in dimension $d \ge 2$, the increase scales as $\delta^3$;
this is observed in Monte Carlo simulations in $d = 2, 3, 4$ and in
theoretical analysis of a mean-field model. We speculate that the
scaling exponent could serve to classify combinatorial optimization
problems of this general kind into a small number of distinct
categories, similar to universality classes in statistical physics.

The interface of statistical physics, algorithmic theory, and
mathematical probability is an active research field, containing
diverse topics such as mixing times of Glauber-type dynamics ([1] and
many others), reconstruction of broadcast information, and
probabilistic analysis of paradigm computational problems such as
k-SAT. In this paper we introduce a new topic whose motivation is
simpler than those.

Freshman calculus tells us that, for a smooth function
$F : \mathbb{R} \to \mathbb{R}$ attaining its minimum at $x^*$, for
$x$ near $x^*$ the relation between $\delta = |x - x^*|$ and
$\epsilon = F(x) - F(x^*)$ is
$\epsilon \approx \frac{1}{2} F''(x^*)\,\delta^2$. If instead we
consider a function $F : \mathbb{R}^d \to \mathbb{R}$ on
d-dimensional space, sophomore calculus tells us that similarly

$$\inf\{F(x) - F(x^*) : |x - x^*| = \delta\} \approx c\,\delta^2$$

for appropriate $c$. So in a sense the scaling exponent 2 is
naturally associated with "smooth" or "regular" optimization
problems.
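
A short worked expansion indicates where the constant $c$ can come from. It assumes, beyond what the text states, that $F$ is twice continuously differentiable near $x^*$ with positive-definite Hessian $H$; here $\lambda_{\min}(H)$ denotes the smallest eigenvalue of $H$ and is unrelated to the penalty parameter introduced later.

```latex
% Assumption (not in the text): F is C^2 near x^* and
% H = \nabla^2 F(x^*) is positive definite.
\begin{align*}
F(x) - F(x^*) &= \tfrac{1}{2}\,(x - x^*)^{\top} H\,(x - x^*)
                 + o(|x - x^*|^2), \\
\inf\{F(x) - F(x^*) : |x - x^*| = \delta\}
              &= \tfrac{1}{2}\,\lambda_{\min}(H)\,\delta^2 + o(\delta^2).
\end{align*}
% Under this assumption c = \lambda_{\min}(H)/2.
```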

Now consider a graph-based combinatorial optimiza- 
tion problem, such as the traveling salesman problem 
(TSP): each feasible solution has n constituents (edges) 
and associated continuous costs (lengths), the sum of 
which gives the overall solution cost. Compare an ar- 
bitrary feasible solution x with the optimal (minimal) 
solution x* — unique, for generic lengths — by the two 
quantities 

$$\delta_n(x) = \{\text{number of edges in } x \text{ but not in } x^*\}/n$$
$$\epsilon_n(x) = \{\text{cost difference between } x \text{ and } x^*\}/s(n)$$

where $s(n)$ expresses the rate at which the optimal cost scales in
$n$. Define $\epsilon_n(\delta)$ to be the minimum value of
$\epsilon_n(x)$ over all feasible solutions $x$ for which
$\delta_n(x) \ge \delta$. Although the function $\epsilon_n(\delta)$
will depend on $n$ and the problem instance, we anticipate that for
typical instances drawn from a suitable probability model it will
converge in the $n \to \infty$ limit to some deterministic function
$\epsilon(\delta)$.
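
To make the definitions concrete, the following sketch evaluates $\epsilon_n(\delta)$ by brute force on a tiny Euclidean TSP instance. It is only an illustration under stated assumptions: the function names are hypothetical, the instance is far too small to show any scaling, distances are plain (non-periodic) Euclidean, and $s(n) = n$ is taken from the models described below.

```python
import itertools
import math
import random

def tour_edges(order):
    """Return the set of undirected edges used by a cyclic tour."""
    n = len(order)
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(n)}

def tour_length(order, pts):
    """Total Euclidean length of a cyclic tour (no periodic boundaries)."""
    n = len(order)
    return sum(math.dist(pts[order[i]], pts[order[(i + 1) % n]])
               for i in range(n))

def epsilon_curve(pts):
    """Map each attainable delta to min eps_n(x) over tours with delta_n(x) >= delta."""
    n = len(pts)
    # Fix city 0 as the start; each tour still appears in both directions.
    tours = [(0,) + p for p in itertools.permutations(range(1, n))]
    lengths = [tour_length(t, pts) for t in tours]
    best = min(range(len(tours)), key=lengths.__getitem__)
    opt_edges, opt_len = tour_edges(tours[best]), lengths[best]
    cheapest = {}  # number of non-optimal edges -> smallest excess cost
    for t, length in zip(tours, lengths):
        k = len(tour_edges(t) - opt_edges)
        cheapest[k] = min(cheapest.get(k, math.inf), length - opt_len)
    curve, running = {}, math.inf
    for k in sorted(cheapest, reverse=True):  # enforce the ">= delta" constraint
        running = min(running, cheapest[k])
        curve[k / n] = running / n            # assumed scaling s(n) = n
    return dict(sorted(curve.items()))

if __name__ == "__main__":
    rng = random.Random(1)
    side = math.sqrt(7)                       # volume scales as n
    pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(7)]
    for delta, eps in epsilon_curve(pts).items():
        print(f"delta={delta:.3f}  eps={eps:.4f}")
```

Exhaustive enumeration is used only because it makes the minimum over feasible solutions unambiguous; it is not how the results reported in this paper were obtained.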

The universality paradigm from statistical physics suggests there
may be a scaling exponent $\alpha$ defined by

$$\epsilon(\delta) \sim \delta^{\alpha} \quad \text{as } \delta \to 0$$

and that the exponent should be robust under model details. In
statistical physics, universality classes typically refer to
critical exponents that characterize the behavior of measurable
quantities both near and at a phase transition. While $\alpha$ is
not a critical exponent here (there is no phase transition), we
suggest that it could play a similar role, categorizing
combinatorial optimization problems into a small set of classes. If
our analogy with freshman calculus is apposite, we expect that the
simplest problems should have scaling exponent 2.

This approach may seem obvious in retrospect, and fits 
within a long-standing tradition in the physical sciences 
(see "Discussion" later). However, it has never been pro- 
posed or explored explicitly. In this paper we report on 
three aspects of our program. For the minimum span- 
ning tree (MST), a classic "algorithmically easy" prob- 
lem solvable to optimality by greedy methods, we con- 
firm that the scaling exponent is indeed 2. We then turn 
to two harder problems: minimum matching (MM) and 
the TSP. Under a mean-field model, our new mathemat- 
ical analysis methods combined with numerics show that 
the scaling exponent is 3 for both MM and TSP, inde- 
pendent of the pseudo-dimension defined below. For the 
Euclidean model the exponent is 2 in the (essentially triv- 
ial) one-dimensional case, while Monte Carlo simulations 
suggest it is 3 in higher dimensions. 

Models 

In the Euclidean model we take n random points in a 
d-dimensional cube whose volume scales as n. Interpoint 
lengths are Euclidean distances. To reduce finite-size ef- 
fects, we take the space to have periodic (toroidal) bound- 
ary conditions when calculating the distances. 

In the mean-field or random link model we imagine $n$ random points
in some abstract space such that the $\binom{n}{2}$ vertex pair
lengths are i.i.d. random variables distributed as $n^{1/d}\,l$,
with probability density $p(l) \sim l^{d-1}$ for small $l$. Here
$0 < d < \infty$ is the pseudo-dimension parameter and the
distribution of small single interpoint lengths mimics that in the
Euclidean model of corresponding dimension $d$, up to a
proportionality constant. Both models are set up so that
nearest-neighbor distances are of order 1 and the scaling of overall
cost in the optimization problems is $s(n) = n$.

A simple case: the MST 

For the MST, for any reasonable model of interpoint lengths
(including the two models above) we expect a scaling exponent of 2.
We will give a rigorous account elsewhere, but the underlying idea
is simple. The classical greedy algorithm gives the following
explicit inclusion criterion for whether an edge $e = (v_1, v_2)$ of
a graph belongs in the MST. Consider the subgraph containing edges
between any two vertices within length $t$ of each other. Let
$\mathrm{perc}(e) \le \mathrm{length}(e)$ be the smallest $t$ that
keeps $v_1$ and $v_2$ within the same connected component. It is not
difficult to see that $e \in \mathrm{MST}$ if and only if
$\mathrm{length}(e) = \mathrm{perc}(e)$.

Given a probability model for $n$ random points and their interpoint
lengths, define a measure $\mu_n(x)$ on $x \in (0, \infty)$ in terms
of the expectation

$$\mu_n(x) = \frac{1}{n}\, E\,\bigl|\{\text{edges } e : 0 < \mathrm{length}(e) - \mathrm{perc}(e) < x\}\bigr|.$$

For any reasonable model we expect an $n \to \infty$ limit measure
$\mu(x)$, with a density $\nu(x) = d\mu/dx$ having a non-zero limit
$\nu(0^+)$.

Now modify the MST by adding an edge $e$ with
$\mathrm{length}(e) - \mathrm{perc}(e) = b$, for some small $b$, to
create a cycle; then delete the longest edge $e' \ne e$ of that
cycle, which necessarily has
$\mathrm{length}(e') = \mathrm{perc}(e)$. This gives a spanning tree
containing exactly one edge not in the MST and having length greater
by $b$. Repeat this procedure with every edge $e$ for which
$0 < \mathrm{length}(e) - \mathrm{perc}(e) < \beta$, for some small
$\beta$. The number of such edges is
$n\mu(\beta) \sim n\,\nu(0^+)\,\beta$ to first order in $\beta$, and
as there is negligible overlap between cycles, each of the new edges
will increase the tree length by $\sim \beta/2$ on average. So

$$\delta(\beta) \sim \nu(0^+)\,\beta, \qquad \epsilon(\beta) \sim \nu(0^+)\,\beta^2/2.$$

Eliminating $\beta$ gives $\epsilon \sim \delta^2 / (2\nu(0^+))$.
This construction must yield essentially the minimum value of
$\epsilon$ for given $\delta$, so the scaling exponent is 2.
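
The inclusion criterion and the swap construction can be sketched in code. The example below is a minimal illustration, not the rigorous account mentioned above: it uses non-periodic Euclidean points in the plane, the names `perc_table` and `delta_epsilon` are hypothetical, and it simply counts qualifying edges, relying on the text's assumption that overlaps between cycles are negligible.

```python
import math
import random

def perc_table(pts):
    """Return (edges, perc): perc[(u, v)] is the smallest threshold t at which
    u and v lie in one component of the graph of edges with length <= t."""
    n = len(pts)
    edges = sorted((math.dist(pts[u], pts[v]), u, v)
                   for u in range(n) for v in range(u + 1, n))
    comp = {v: [v] for v in range(n)}   # component label -> member vertices
    label = list(range(n))              # vertex -> component label
    perc = {}
    for length, u, v in edges:          # Kruskal's greedy scan, shortest first
        a, b = label[u], label[v]
        if a == b:
            continue                    # u, v already connected: not an MST edge
        # This edge first connects every pair split between the two components.
        for x in comp[a]:
            for y in comp[b]:
                perc[(min(x, y), max(x, y))] = length
        if len(comp[a]) < len(comp[b]):
            a, b = b, a
        for y in comp[b]:
            label[y] = a
        comp[a].extend(comp.pop(b))
    return edges, perc

def delta_epsilon(edges, perc, n, beta):
    """Swap in every edge with 0 < length - perc < beta; return (delta, epsilon)."""
    excess = [length - perc[(u, v)] for length, u, v in edges]
    chosen = [b for b in excess if 0 < b < beta]
    return len(chosen) / n, sum(chosen) / n

if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    side = math.sqrt(n)                 # area scales as n
    pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    edges, perc = perc_table(pts)
    for beta in (0.05, 0.1, 0.2):
        print(beta, delta_epsilon(edges, perc, n, beta))
```

Edges with zero excess are exactly the MST edges, so the same table both verifies the criterion length(e) = perc(e) and supplies the excess lengths that enter $\delta(\beta)$ and $\epsilon(\beta)$.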

Poisson Weighted Infinite Tree 

We now consider the minimum matching (MM) and traveling salesman
problem (TSP). In MM, we ask for the minimum total length $L_n$ of
$n/2$ edges matching $n$ random points, and study the normalized
limit expectation $\lim_{n \to \infty} \frac{2}{n} E[L_n]$. Taking
the mean-field model with $d = 1$ for simplicity, the limit value
$\pi^2/6$ was obtained using the replica method from statistical
physics. We work in the framework of [8], which rederives this limit
rigorously by doing calculations within an $n = \infty$ limit
structure, the Poisson weighted infinite tree (PWIT).

FIG. 1: Matching on a PWIT, (a) with and (b) without root
node. Numbers represent edge weights (lengths).

Briefly, the PWIT is an infinite degree rooted tree in which the
edge weights (lengths) at each vertex are distributed as the
successive points $0 < \xi_1 < \xi_2 < \cdots$ of a Poisson process
with a mean number $x^d$ of points in $[0, x]$, i.e., a process with
rate increasing as $d\,x^{d-1}$. In
this way, the PWIT corresponds to the mean-field model 
at a given d (see for review). 

Consider a matching on an instance of a rooted PWIT, as well as a
matching on the same instance but with the root removed, as shown in
Fig. 1. Introduce the variable

$$X = \text{length of optimal matching on tree with root} - \text{length of optimal matching on tree without root}.$$

Both lengths are infinite, so this is interpreted as a limit of
finite differences. If $X_i$ is the analogous quantity for the $i$th
constituent subtree of the rootless PWIT instance and $\xi_i$ the
length of the root's $i$th edge, these variables satisfy the
recursion

$$X = \min_{1 \le i < \infty} (\xi_i - X_i). \tag{1}$$

Now take the $\{\xi_i\}$ to be the Poisson-distributed edge lengths
and the $\{X_i\}$ to be independent random variables from the same
random process that produces $X$. Eq. (1) is then a distributional
equation for $X$, and can be shown for $d = 1$ to have as its unique
solution the logistic distribution

$$P(X \le x) = \frac{1}{1 + e^{-x}}, \qquad -\infty < x < \infty. \tag{2}$$
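
The distributional equation can also be explored numerically. The sketch below iterates the recursion $X = \min_i(\xi_i - X_i)$ on a finite pool of samples and compares the result with the logistic form. It is an illustration under assumptions of my own: the pool size, sweep count, and the name `solve_rde` are arbitrary, and entries are replaced in place rather than generation by generation, because shifting every $X_i$ by $c$ shifts $X$ by $-c$, so a synchronous iteration may oscillate.

```python
import math
import random

def solve_rde(pool_size=2000, sweeps=20, d=1.0, seed=0):
    """Population-dynamics sketch for X = min_i (xi_i - X_i) on the PWIT."""
    rng = random.Random(seed)
    pool = [0.0] * pool_size
    cap = 0.0                      # upper bound on every value ever in the pool
    for _ in range(sweeps * pool_size):
        best, t = math.inf, 0.0
        while True:
            t += rng.expovariate(1.0)      # rate-1 Poisson arrivals
            xi = t ** (1.0 / d)            # gives mean number x**d in [0, x]
            if xi - cap > best:            # no later point can beat `best`
                break
            best = min(best, xi - pool[rng.randrange(pool_size)])
        pool[rng.randrange(pool_size)] = best   # in-place replacement
        cap = max(cap, best)
    return pool

if __name__ == "__main__":
    pool = solve_rde()
    for x in (-2.0, 0.0, 1.0, 2.0):
        empirical = sum(v <= x for v in pool) / len(pool)
        print(f"x={x:+.1f}  empirical={empirical:.3f}  "
              f"logistic={1.0 / (1.0 + math.exp(-x)):.3f}")
```

The early-exit test relies on the Poisson points being generated in increasing order: once a point exceeds the current minimum by more than the largest pool value, all later points can be ignored without changing the minimum.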



The PWIT structure further leads to the following inclusion
criterion. Consider an edge of length $x$ in the tree, and the two
subtrees formed by deleting that edge. The memoryless nature of the
Poisson process allows us to consider each of these subtrees as
independent copies of a PWIT, with their roots at the vertices of
the deleted edge. It may be seen that including the edge in the
optimal matching incurs a cost of $x - X_1 - X_2$, where $X_1$ and
$X_2$ are the $X$ variables as defined above, but for the two
subtrees. Thus, an edge of length $x$ is present in the minimal
matching if and only if

$$x < X_1 + X_2. \tag{3}$$

The probability density function for edge lengths in the minimal
matching is then

$$f(x) = P(x < X_1 + X_2), \qquad 0 < x < \infty.$$

Here $X_1$ and $X_2$ are independent random variables distributed
according to Eq. (2), from which the mean edge length can be
calculated:

$$\int_0^\infty x\, P(x < X_1 + X_2)\, dx = \pi^2/6.$$
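
The value $\pi^2/6$ can be checked by direct sampling. Writing $S = X_1 + X_2$, the integral equals $E[\max(S, 0)^2]/2$, since $\int_0^\infty x\,\mathbf{1}(x < S)\,dx = \max(S,0)^2/2$. The sketch below assumes that $X_1$, $X_2$ are independent logistic variables as stated in the text; the sample size and function names are arbitrary choices.

```python
import math
import random

def logistic_sample(rng):
    """Draw from P(X <= x) = 1 / (1 + exp(-x)) by inverting the CDF."""
    u = rng.random()
    while u == 0.0:                # avoid log(0); rng.random() is in [0, 1)
        u = rng.random()
    return math.log(u / (1.0 - u))

def mean_edge_length(samples=200_000, seed=0):
    """Monte Carlo estimate of the integral of x * P(x < X1 + X2) over x > 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        s = logistic_sample(rng) + logistic_sample(rng)
        total += 0.5 * max(s, 0.0) ** 2
    return total / samples

if __name__ == "__main__":
    print(mean_edge_length(), math.pi ** 2 / 6)
```

The agreement follows from symmetry: $S$ has variance $2\pi^2/3$, so $E[\max(S,0)^2] = \pi^2/3$.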



Mean-field MM and TSP 

The previous section summarized analysis from [8]; now we continue
with new analysis. To study scaling exponents, we introduce a
parameter $\lambda > 0$ that plays the role of a Lagrange
multiplier. Penalize edges used in the optimal matching by adding
$\lambda$ to their length. Let us study optimal solutions to the MM
problem on this new penalized instance. Precisely, on a realization
of the PWIT, define $Y$ and $Z$ as

$$\text{length of optimal matching on new tree with root} - \text{length of optimal matching on new tree without root}$$

where $Y$ and $Z$ differ in the definition of the edge lengths of
the new tree: for $Y$, the edges penalized are those employed by the
original rooted optimal matching; for $Z$, they are those employed
by the original rootless optimal matching.

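For the penalized problem the recursion Eq. (1) for $X$ is
supplemented by the following recursions for $(X, Y, Z)$ jointly.
Let $i^*$ be the value of $i$ that minimizes $\xi_i - X_i$. Then

$$Y = \min_i \bigl(\xi_i - (Z_i - \lambda)\,\mathbf{1}(i = i^*) - Y_i\,\mathbf{1}(i \ne i^*)\bigr)$$

$$Z = \min_i (\xi_i - Y_i)$$

where, as before, the $\{Y_i\}$ and $\{Z_i\}$ are independent random
variables from the same random process producing $Y$ and $Z$. The
root edge $i^*$ is the one used by the original rooted matching, so
it carries the penalty $\lambda$ and its subtree is penalized
according to the original rootless matching.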

Moreover, we get an inclusion criterion, analogous to Eq. (3): an
edge of length $x$ is included in the optimal penalized matching if
and only if

$$x + \lambda < Z_1 + Z_2 \quad \text{if edge used in optimal matching}$$

$$x < Y_1 + Y_2 \quad \text{if edge not used in optimal matching.}$$

In terms of the expected unique joint distribution for $(X, Y, Z)$,
the quantities $\delta$ and $\epsilon$ that compare the penalized
solution (as a non-optimal solution of the original problem) with
the original optimal solution are

$$\delta(\lambda) = \int_0^\infty P\{\text{edge of length } x \text{ is in optimal penalized matching but not in optimal matching}\}\, dx = \int_0^\infty P(X_1 + X_2 < x < Y_1 + Y_2)\, dx$$

and

$$\epsilon(\lambda) = \int_0^\infty x\, P\{\text{edge of length } x \text{ is in optimal penalized matching}\}\, dx - \pi^2/6$$

$$\phantom{\epsilon(\lambda)} = \int_0^\infty x\, \bigl[P(x < X_1 + X_2,\ x < Z_1 + Z_2 - \lambda) + P(X_1 + X_2 < x < Y_1 + Y_2)\bigr]\, dx - \pi^2/6.$$

By the theory of Lagrange multipliers these functions
$\epsilon(\lambda)$, $\delta(\lambda)$ determine $\epsilon(\delta)$.
We do not have explicit analytic expressions analogous to Eq. (2)
for the joint distribution of $(X, Y, Z)$ in terms of $\lambda$.
However, we can use routine bootstrap Monte Carlo simulations to
simulate the distribution and thence estimate the functions
$\delta(\lambda)$ and $\epsilon(\lambda)$ numerically. And as
indicated elsewhere (sec. 6.2 of the cited work), the mean-field MM
and the mean-field TSP can be studied using similar techniques; the
TSP analysis is just a minor variation of the MM analysis. For
instance, recursion Eq. (1) becomes

$$X = \min^{[2]}_{1 \le i < \infty} (\xi_i - X_i)$$

where $\min^{[2]}$ denotes second-minimum.
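
For the MM case, the bootstrap Monte Carlo idea can be sketched as a pool of $(X, Y, Z)$ triples updated by the joint recursions, followed by pairing triples to estimate $\delta(\lambda)$ and $\epsilon(\lambda)$. This is a rough illustration and not the authors' code: pool sizes and names are assumptions, entries are replaced in place, and the penalty is added to the length of root edge $i^*$ (its term is $\xi_{i^*} - Z_{i^*} + \lambda$), the sign that agrees with the inclusion criterion $x + \lambda < Z_1 + Z_2$. The integrals are evaluated in closed form per pair, using $\int_a^b x\,dx = (b^2 - a^2)/2$.

```python
import math
import random

def evaluate(points, lam):
    """Apply the X, Y, Z recursions to a finite list of (xi, x_i, y_i, z_i)."""
    i_star = min(range(len(points)), key=lambda i: points[i][0] - points[i][1])
    xi_s, x_s, _, z_s = points[i_star]
    x_new = xi_s - x_s
    z_new = min(xi - y for xi, _, y, _ in points)
    y_new = xi_s - z_s + lam               # root edge i* carries the penalty
    for i, (xi, _, y, _) in enumerate(points):
        if i != i_star:
            y_new = min(y_new, xi - y)
    return x_new, y_new, z_new

def simulate_pool(lam, pool_size=400, sweeps=15, d=1.0, seed=0):
    """Population-dynamics sketch for the joint law of (X, Y, Z)."""
    rng = random.Random(seed)
    pool = [(0.0, 0.0, 0.0)] * pool_size
    cap = 0.0                              # bound on all pool components so far
    for _ in range(sweeps * pool_size):
        points, t = [], 0.0
        while True:
            t += rng.expovariate(1.0)
            xi = t ** (1.0 / d)
            # Stop once no later Poisson point can change any of the minima.
            if points and xi - cap > max(evaluate(points, lam)):
                break
            points.append((xi,) + pool[rng.randrange(pool_size)])
        new = evaluate(points, lam)
        cap = max(cap, *new)
        pool[rng.randrange(pool_size)] = new
    return pool

def delta_epsilon(pool, lam, pairs=20000, seed=1):
    """Estimate delta(lambda) and epsilon(lambda) from pairs of pool triples."""
    rng = random.Random(seed)
    d_sum = e_sum = 0.0
    for _ in range(pairs):
        x1, y1, z1 = rng.choice(pool)
        x2, y2, z2 = rng.choice(pool)
        sx = max(x1 + x2, 0.0)
        sy = max(y1 + y2, 0.0)
        kept = max(min(x1 + x2, z1 + z2 - lam), 0.0)   # edge in both matchings
        d_sum += max(sy - sx, 0.0)                     # new edges only
        e_sum += 0.5 * kept ** 2 + 0.5 * max(sy ** 2 - sx ** 2, 0.0)
    return d_sum / pairs, e_sum / pairs - math.pi ** 2 / 6

if __name__ == "__main__":
    for lam in (0.1, 0.2, 0.4):
        print(lam, delta_epsilon(simulate_pool(lam), lam))
```

With pools this small the estimates are noisy, so the sketch shows the mechanics rather than reproducing the reported exponent; at $\lambda = 0$ the three components of every triple coincide and $\delta$ vanishes, which is a convenient sanity check.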

Table I reports numerical results showing good agreement with
$\epsilon \propto \delta^3$ in both problems for $d = 1$. These
numerics are compatible with independent MM results obtained
recently [12], as well as with our direct simulations on mean-field
TSP instances at $n = 512$. The same exponent 3 arises for other
$d$.

Euclidean MM and TSP 

We consider the d = 1 case where the scaling exponent 
can be found exactly, and give numerical results for other 
cases. We restrict the discussion to the Euclidean TSP, 
although as for the mean-field model, MM is phenomeno- 
logically similar. 

Take the Euclidean TSP in $d = 1$, with periodic boundary
conditions. The optimal tour here is trivial (with high probability
a straight line of length $n$) but nevertheless instructive to
analyze. As before, add a penalty term $\lambda$ to each edge used
in the tour, and consider how the optimal tour changes in this new
penalized instance. When $\lambda$ is small, changes to the tour
will consist of "2-changes" shown in Fig. 2 and will occur when an
original edge length is less than $\lambda$. A simple
nearest-neighbor distance argument gives the distribution of edge
lengths in the original tour as $p(l) = e^{-l}$. Since two edges are
modified in each 2-change,

$$\delta(\lambda) = 2\int_0^\lambda p(l)\, dl \sim 2\lambda, \qquad \epsilon(\lambda) = 2\int_0^\lambda l\, p(l)\, dl \sim \lambda^2.$$

FIG. 2: "2-change" schematic. Original optimal tour is shown
by dashed line. New optimal tour on penalized instance is
shown by solid line: over sufficiently short lengths, tour
doubles back to avoid using penalized edges.



The scaling exponent of 2 is not surprising, as the 1-d 
TSP behaves very similarly to the 1-d MST on penalized 
instances. Furthermore, it is consistent with the intuition 
that the "easiest" problems scale in this way. A similar
argument applies to MM, and in both cases a more rigorous
analysis yields bounds on $\delta$ and $\epsilon$ that confirm the
exponent.
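
The one-dimensional argument can be checked with a few lines of arithmetic. The sketch below assumes an exponential edge-length density $p(l) = e^{-l}$ (unit-density points on a line) and that each 2-change modifies two edges and adds $2l$ to the tour length; under those assumptions both integrals have closed forms, and the local log-log slope of $\epsilon$ against $\delta$ approaches 2 as $\lambda \to 0$.

```python
import math

def delta_1d(lam):
    """2 * integral_0^lam exp(-l) dl."""
    return 2.0 * (1.0 - math.exp(-lam))

def epsilon_1d(lam):
    """2 * integral_0^lam l * exp(-l) dl."""
    return 2.0 * (1.0 - (1.0 + lam) * math.exp(-lam))

def local_exponent(lam, factor=2.0):
    """Slope of log(epsilon) versus log(delta) between lam and factor * lam."""
    d1, d2 = delta_1d(lam), delta_1d(factor * lam)
    e1, e2 = epsilon_1d(lam), epsilon_1d(factor * lam)
    return math.log(e2 / e1) / math.log(d2 / d1)

if __name__ == "__main__":
    for lam in (0.1, 0.01, 0.001):
        print(lam, delta_1d(lam), epsilon_1d(lam), local_exponent(lam))
```

To leading order $\delta \approx 2\lambda$ and $\epsilon \approx \lambda^2$, so $\epsilon \approx \delta^2/4$ in this simplified picture.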

For $d > 1$, numerical results are shown in Fig. 3. These have been
obtained by finding exact solutions to randomly generated $n = 512$
Euclidean instances in $d = 2, 3, 4$, using the Concorde TSP solver
[13]. For each instance, the optimum was obtained on the original
instance as well as on the instance penalized with a range of
$\lambda$ values. For each $\lambda$ value, $\delta(\lambda)$ and
$\epsilon(\lambda)$ were averaged over the sample of instances. The
resulting numerics are closely consistent with a scaling exponent of
3 (in spite of suffering from some finite-size effects at smaller
$\delta$), suggesting that the mean-field picture gives the correct
exponent in all but the trivial 1-dimensional case. In the language
of critical exponents, this would correspond to an "upper critical
dimension" of 2.
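
Estimating an exponent from averaged $(\delta, \epsilon)$ pairs amounts to fitting a straight line on log-log axes. The helper below is a generic least-squares sketch, not the fitting procedure actually used for the figure; the function name is hypothetical and the numbers in the usage example are synthetic, generated from an exact cubic purely to show the call.

```python
import math

def fit_power_law(deltas, epsilons):
    """Least-squares line through (log delta, log epsilon).

    Returns (exponent, prefactor) for the model epsilon = prefactor * delta**exponent.
    """
    xs = [math.log(d) for d in deltas]
    ys = [math.log(e) for e in epsilons]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, math.exp(my - slope * mx)

if __name__ == "__main__":
    # Synthetic data only: an exact cubic, not measurements from the paper.
    deltas = [0.02, 0.04, 0.08, 0.16]
    epsilons = [2.3 * d ** 3 for d in deltas]
    print(fit_power_law(deltas, epsilons))
```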

Discussion 



TABLE I: Scaling for mean-field MM and TSP in pseudo-dimension
$d = 1$, obtained by simulating joint distribution of $(X, Y, Z)$.
Results show a good fit to $\epsilon \sim 2.3\,\delta^3$ and
$2.0\,\delta^3$. In more detail, $\delta$ scales as $\lambda^{1/2}$
while $\epsilon$ scales as $\lambda^{3/2}$. Estimates for $\epsilon$
have standard deviation about 0.001 for MM and 0.003 for TSP.

FIG. 3: Scaling for Euclidean TSP in $d = 2, 3, 4$, based on
exact solutions for 100 instances in each case. Data points
correspond to $\lambda$ values from 0.004 to 0.05. Slopes of
best-fit lines vary from 2.94 to 3.24. Standard deviation is about
$3 \times 10^{-3}$ for $\delta$ and $3 \times 10^{-5}$ for $\epsilon$.



The goal of our scaling study has been to address a new kind of
problem in the theory of algorithms, using concepts from statistical
physics. Traditionally, work on the TSP within the theory of
algorithms [14] has emphasized algorithmic performance, rather than
the kinds of questions we ask here. Rigorous study of the Euclidean
TSP model within mathematical probability [15] has yielded a
surprising amount of qualitative information: existence of an
$n \to \infty$ limit constant giving the mean edge-length in the
optimal TSP tour [16], and large deviation bounds for the
probability that the total tour length differs substantially from
its mean. However, calculation of explicit constants in dimensions
$d \ge 2$ seems beyond the reach of analytic techniques. For the
mean-field bipartite MM problem, impressive recent work has proven
an exact formula giving the expectation of the finite-$n$ minimum
total matching length, though such exact methods seem unlikely to be
widely feasible.

On the other hand, there has been significant progress over the past
twenty years in the use of statistical physics techniques on
combinatorial optimization problems in general. Finding optimal
solutions to these problems is a direct analog to determining ground
states in statistical physics models of disordered systems [20].
This observation has motivated the development of such approaches as
simulated annealing [21], the replica method [22] and the cavity
method. Condensed matter physics, and particularly models arising in
spin glass theory, has provided a powerful means to study
algorithmic problems; at the same time, algorithmic results have
implications for the associated physical models. It is instructive
to consider our work in that context.

Researchers in the physical sciences have long been interested in
the low-temperature thermodynamics of disordered systems,
investigating properties of near-optimal states in spin glass
models. Our procedure for studying near-optimal solutions by way of
a penalty parameter is similar to a method, known as
$\epsilon$-coupling [24, 25], used for calculating low-energy
excitations in spin glasses. Making use of this method, physicists
have obtained quantities closely analogous to our scaling exponents
for models of RNA folding [25]. Furthermore, in the last year
independent work [12] has explored $\epsilon$-coupling on MM,
numerically identifying a different but related scaling exponent.

For the TSP, analytical and numerical studies were performed over
fifteen years ago on the thermodynamics of the model, with overlap
quantities calculated for near-optimal solutions. The results have
suggested that at low temperature $T$, the cost excess $\epsilon$
scales as $T^2$ while the average fraction of differing edges
between solutions ($1 - q$, with $q$ being the "overlap fraction")
scales as $T$. This leads to $\epsilon \sim (1 - q)^2$, in apparent
contradiction with our exponent of 3. However, at low temperatures,
$q$ represents overlaps between typical near-optimal solutions,
whereas our $\delta$ measures overlaps between a near-optimal
solution and the optimum. The different definitions of these two
quantities could account for the discrepancy in scaling exponent: it
is not surprising that $1 - q$ grows faster than $\delta$ as one
considers solutions of increasing cost. At the same time, a possible
implication of these results is that at low temperature,
$\delta \sim T^{2/3}$. We are not aware of any direct theoretical
arguments to explain this, and consider it an intriguing open
question.

It is also important to note that the underlying property
$\delta \to 0$ as $\epsilon \to 0$ cannot always be taken for
granted. This property is called asymptotic essential uniqueness
(AEU). AEU requires, among other things, that the optimum itself be
unique. In principle, even if it is not, one could still analyze
near-optimal scaling by considering sufficiently local perturbations
from a given optimum. It is natural to expect the resulting exponent
to be independent of the specific optimum chosen. However, this may
not be true in the event of what statistical physicists call replica
symmetry breaking (RSB). AEU is a special case of replica symmetry,
so while RSB implies the absence of AEU, the absence of AEU does not
necessarily imply RSB. A current debate in the condensed matter
literature concerns whether or not low-temperature spin glasses
display RSB. It is generally believed that RSB is incompatible with
unique non-zero values of various scaling exponents. Thus, the
correct approach to analyzing near-optimal scaling in such problems
remains another open question.

One final example may serve to illustrate the diversity of possible
applications for our type of scaling analysis, as well as an
instance where the absence of AEU is surmountable. In oriented
percolation on the two-dimensional lattice, there are independent
random traversal times on each oriented (up or right) edge. The
percolation time $T_n$ is the minimum, over all $\binom{2n}{n}$
paths from $(0,0)$ to $(n,n)$, of the time to traverse the path. So
$(2n)^{-1} E[T_n] \to t^*$, a time constant. It is elementary that
there will be near-optimal paths, with lengths $T'_n$ such that
$n^{-1}(E T'_n - E T_n) \to 0$ and which are almost disjoint from
the optimal path. So our $\epsilon(\delta)$ analysis applied to
paths will not be useful: even with a unique optimum, AEU will not
hold. But we can rephrase the problem in terms of flows. A flow on
the $n \times n$ oriented torus assigns to each edge a flow of size
in $[0,1]$, such that at each vertex, in-flow equals out-flow. Let
$t(\delta)$ be the minimum, over all flows with mean flow-per-edge
$= \delta$, of the flow-weighted average edge traversal time. In the
$n \to \infty$ limit, one can show that as $\delta \to 0$,
$t(\delta) \to t^*$ where $t^*$ is the same limiting constant as
before. We therefore expect a scaling exponent
$t(\delta) - t^* \sim \delta^{\alpha}$. Mean field analysis gives
$\alpha = 2$, and Monte Carlo study of the $d = 2$ case is in
progress.
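
The path formulation of the percolation time is easy to compute by dynamic programming, which the sketch below compares against brute-force enumeration of all $\binom{2n}{n}$ oriented paths on a small grid. It covers only the path version of $T_n$, not the flow reformulation; the exponential traversal times and all function names are assumptions made for illustration, since the text does not specify a distribution.

```python
import itertools
import random

def random_times(n, rng):
    """Independent traversal times (assumed exponential with mean 1).

    right[i][j] is the edge (i, j) -> (i + 1, j); up[i][j] is (i, j) -> (i, j + 1).
    """
    right = [[rng.expovariate(1.0) for _ in range(n + 1)] for _ in range(n)]
    up = [[rng.expovariate(1.0) for _ in range(n)] for _ in range(n + 1)]
    return right, up

def percolation_time(n, right, up):
    """Minimum traversal time from (0, 0) to (n, n) by dynamic programming."""
    T = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0:
                candidates.append(T[i - 1][j] + right[i - 1][j])
            if j > 0:
                candidates.append(T[i][j - 1] + up[i][j - 1])
            T[i][j] = min(candidates)
    return T[n][n]

def brute_force_time(n, right, up):
    """Same minimum, by enumerating every choice of which n steps go up."""
    best = float("inf")
    for ups in itertools.combinations(range(2 * n), n):
        i = j = 0
        total = 0.0
        for step in range(2 * n):
            if step in ups:
                total += up[i][j]
                j += 1
            else:
                total += right[i][j]
                i += 1
        best = min(best, total)
    return best

if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    right, up = random_times(n, rng)
    print(percolation_time(n, right, up) / (2 * n))  # rough estimate of t*
```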

Conclusions 

We have studied the scaling of the relative cost difference
$\epsilon$ between optimal and near-optimal solutions to
combinatorial optimization problems, as a function of the solution's
relative distance $\delta$ from optimality. This kind of scaling
study, although well accepted in theoretical physics, is new to
combinatorial optimization problems. For the MST, we have found
$\epsilon \sim \delta^2$. For the MM and TSP, in the 1-d Euclidean
case $\epsilon \sim \delta^2$ as well, while in both the mean-field
model and higher Euclidean dimensions $\epsilon \sim \delta^3$.

The scaling exponent may categorize combinatorial op- 
timization problems into a small number of classes. The 
fact that MST is solvable by a simple greedy algorithm, 
and that the 1-d case of the MM and TSP is essentially 
trivial, suggests that a scaling exponent of 2 character- 
izes problems of very low complexity. The exponent of 
3 characterizes problems that are algorithmically more 
difficult. Of course, this is a different kind of classification
from traditional notions of computational complexity: MM is
solvable to optimality in $O(n^3)$ time whereas
the TSP is NP-hard. Rather, these exponent classes are
reminiscent of universality classes in statistical physics, 
which unite diverse physical systems exhibiting identical 
behavior near phase transitions. 

A key question in the study of critical phenomena is whether
mean-field models correctly describe phase transition behavior in
the geometric models they approximate. The TSP and MM do not involve
critical behavior, but the fact that mean-field and geometric
scaling exponents coincide for $d \ge 2$ is significant. It provides
evidence that in a combinatorial setting, the mean-field 
approach can give a valuable and accurate description of 
the structure of near-optimal solutions. 